Search CORE

479 research outputs found

Words are Malleable: Computing Semantic Shifts in Political and Media Discourse

Author: Gallie W. B.
Jin P.
Kusner M. J.
Le Q. V.
Mikolov T.
Mikolov T.
Mikolov T.
Reese S. D.
Publication venue
Publication date: 01/01/2017
Field of study

Recently, researchers started to pay attention to the detection of temporal shifts in the meaning of words. However, most (if not all) of these approaches restricted their efforts to uncovering change over time, thus neglecting other valuable dimensions such as social or political variability. We propose an approach for detecting semantic shifts between different viewpoints--broadly defined as a set of texts that share a specific metadata feature, which can be a time-period, but also a social entity such as a political party. For each viewpoint, we learn a semantic space in which each word is represented as a low dimensional neural embedded vector. The challenge is to compare the meaning of a word in one space to its meaning in another space and measure the size of the semantic shifts. We compare the effectiveness of a measure based on optimal transformations between the two spaces with a measure based on the similarity of the neighbors of the word in the respective spaces. Our experiments demonstrate that the combination of these two performs best. We show that the semantic shifts not only occur over time, but also along different viewpoints in a short period of time. For evaluation, we demonstrate how this approach captures meaningful semantic shifts and can help improve other tasks such as the contrastive viewpoint summarization and ideology detection (measured as classification accuracy) in political texts. We also show that the two laws of semantic change which were empirically shown to hold for temporal shifts also hold for shifts across viewpoints. These laws state that frequent words are less likely to shift meaning while words with many senses are more likely to do so.Comment: In Proceedings of the 26th ACM International on Conference on Information and Knowledge Management (CIKM2017

arXiv.org e-Print Archive

Crossref

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Joint Deep Modeling of Users and Items Using Reviews for Recommendation

Author: Bao Y.
Krizhevsky A.
Mikolov T.
Mikolov T.
Salakhutdinov R.
Tieleman T.
Van den Oord A.
Publication venue
Publication date: 17/01/2017
Field of study

A large amount of information exists in reviews written by users. This source of information has been ignored by most of the current recommender systems while it can potentially alleviate the sparsity problem and improve the quality of recommendations. In this paper, we present a deep model to learn item properties and user behaviors jointly from review text. The proposed model, named Deep Cooperative Neural Networks (DeepCoNN), consists of two parallel neural networks coupled in the last layers. One of the networks focuses on learning user behaviors exploiting reviews written by the user, and the other one learns item properties from the reviews written for the item. A shared layer is introduced on the top to couple these two networks together. The shared layer enables latent factors learned for users and items to interact with each other in a manner similar to factorization machine techniques. Experimental results demonstrate that DeepCoNN significantly outperforms all baseline recommender systems on a variety of datasets.Comment: WSDM 201

arXiv.org e-Print Archive

Crossref

Tweet2Vec: Learning Tweet Embeddings Using Character-level CNN-LSTM Encoder-Decoder

Author: Kiros R.
Mikolov T.
Sutskever I.
Vosoughi S.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/07/2016
Field of study

We present Tweet2Vec, a novel method for generating general-purpose vector representation of tweets. The model learns tweet embeddings using character-level CNN-LSTM encoder-decoder. We trained our model on 3 million, randomly selected English-language tweets. The model was evaluated using two methods: tweet semantic similarity and tweet sentiment categorization, outperforming the previous state-of-the-art in both tasks. The evaluations demonstrate the power of the tweet embeddings generated by our model for various tweet categorization tasks. The vector representations generated by our model are generic, and hence can be applied to a variety of tasks. Though the model presented in this paper is trained on English-language tweets, the method presented can be used to learn tweet embeddings for different languages.Comment: SIGIR 2016, July 17-21, 2016, Pisa. Proceedings of SIGIR 2016. Pisa, Italy (2016

arXiv.org e-Print Archive

DSpace@MIT

Crossref

A Deep Siamese Network for Scene Detection in Broadcast Videos

Author: Apostolidis E.
Krizhevsky A.
Mikolov T.
Zhou B.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015
Field of study

We present a model that automatically divides broadcast videos into coherent scenes by learning a distance measure between shots. Experiments are performed to demonstrate the effectiveness of our approach by comparing our algorithm against recent proposals for automatic scene segmentation. We also propose an improved performance measure that aims to reduce the gap between numerical evaluation and expected results, and propose and release a new benchmark dataset.Comment: ACM Multimedia 201

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

DeepWalk: Online Learning of Social Representations

Author: Al-Rfou R.
Bottou L.
Dean J.
Hinton G. E.
Kondor R. I.
Krizhevsky A.
Macskassy S. A.
Mikolov T.
Mikolov T.
Morin F.
Neville J.
Recht B.
Vishwanathan S.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 27/06/2014
Field of study

We present DeepWalk, a novel approach for learning latent representations of vertices in a network. These latent representations encode social relations in a continuous vector space, which is easily exploited by statistical models. DeepWalk generalizes recent advancements in language modeling and unsupervised feature learning (or deep learning) from sequences of words to graphs. DeepWalk uses local information obtained from truncated random walks to learn latent representations by treating walks as the equivalent of sentences. We demonstrate DeepWalk's latent representations on several multi-label network classification tasks for social networks such as BlogCatalog, Flickr, and YouTube. Our results show that DeepWalk outperforms challenging baselines which are allowed a global view of the network, especially in the presence of missing information. DeepWalk's representations can provide

F_1

scores up to 10% higher than competing methods when labeled data is sparse. In some experiments, DeepWalk's representations are able to outperform all baseline methods while using 60% less training data. DeepWalk is also scalable. It is an online learning algorithm which builds useful incremental results, and is trivially parallelizable. These qualities make it suitable for a broad class of real world applications such as network classification, and anomaly detection.Comment: 10 pages, 5 figures, 4 table

arXiv.org e-Print Archive

Crossref

Fast Matrix Factorization for Online Recommendation with Implicit Feedback

Author: Levy O.
Ling G.
Marlin B. M.
Mikolov T.
Rendle S.
Sculley D.
Publication venue
Publication date: 16/08/2017
Field of study

This paper contributes improvements on both the effectiveness and efficiency of Matrix Factorization (MF) methods for implicit feedback. We highlight two critical issues of existing works. First, due to the large space of unobserved feedback, most existing works resort to assign a uniform weight to the missing data to reduce computational complexity. However, such a uniform assumption is invalid in real-world settings. Second, most methods are also designed in an offline setting and fail to keep up with the dynamic nature of online data. We address the above two issues in learning MF models from implicit feedback. We first propose to weight the missing data based on item popularity, which is more effective and flexible than the uniform-weight assumption. However, such a non-uniform weighting poses efficiency challenge in learning the model. To address this, we specifically design a new learning algorithm based on the element-wise Alternating Least Squares (eALS) technique, for efficiently optimizing a MF model with variably-weighted missing data. We exploit this efficiency to then seamlessly devise an incremental update strategy that instantly refreshes a MF model given new feedback. Through comprehensive experiments on two public datasets in both offline and online protocols, we show that our eALS method consistently outperforms state-of-the-art implicit MF methods. Our implementation is available at https://github.com/hexiangnan/sigir16-eals.Comment: 10 pages, 8 figure

arXiv.org e-Print Archive

Crossref

Can Machines Think in Radio Language?

Author: AM Turing
B Sendhoff
D Silver
S Legg
T Mikolov
V Mnih
Y LeCun
Publication venue
Publication date: 17/12/2017
Field of study

People can think in auditory, visual and tactile forms of language, so can machines principally. But is it possible for them to think in radio language? According to a first principle presented for general intelligence, i.e. the principle of language's relativity, the answer may give an exceptional solution for robot astronauts to talk with each other in space exploration.Comment: 4 pages, 1 figur

arXiv.org e-Print Archive

Crossref

Language Models as Emotional Classifiers for Textual Conversations

Author: Mikolov T.
Vaswani A.
Yang Z.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 27/08/2020
Field of study

Emotions play a critical role in our everyday lives by altering how we perceive, process and respond to our environment. Affective computing aims to instill in computers the ability to detect and act on the emotions of human actors. A core aspect of any affective computing system is the classification of a user's emotion. In this study we present a novel methodology for classifying emotion in a conversation. At the backbone of our proposed methodology is a pre-trained Language Model (LM), which is supplemented by a Graph Convolutional Network (GCN) that propagates information over the predicate-argument structure identified in an utterance. We apply our proposed methodology on the IEMOCAP and Friends data sets, achieving state-of-the-art performance on the former and a higher accuracy on certain emotional labels on the latter. Furthermore, we examine the role context plays in our methodology by altering how much of the preceding conversation the model has access to when making a classification

arXiv.org e-Print Archive

Crossref

Information Extraction in Illicit Domains

Author: Banko M.
Bauer F.
Chakrabarti S.
Kushmerick N.
Mikolov T.
Sahlgren M.
Wick M.
Zouaq A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/03/2017
Field of study

Extracting useful entities and attribute values from illicit domains such as human trafficking is a challenging problem with the potential for widespread social impact. Such domains employ atypical language models, have `long tails' and suffer from the problem of concept drift. In this paper, we propose a lightweight, feature-agnostic Information Extraction (IE) paradigm specifically designed for such domains. Our approach uses raw, unlabeled text from an initial corpus, and a few (12-120) seed annotations per domain-specific attribute, to learn robust IE models for unobserved pages and websites. Empirically, we demonstrate that our approach can outperform feature-centric Conditional Random Field baselines by over 18\% F-Measure on five annotated sets of real-world human trafficking datasets in both low-supervision and high-supervision settings. We also show that our approach is demonstrably robust to concept drift, and can be efficiently bootstrapped even in a serial computing environment.Comment: 10 pages, ACM WWW 201

arXiv.org e-Print Archive

Crossref

Unsupervised, Efficient and Semantic Expertise Retrieval

Author: Bailey P.
Balog K.
Cao Y.
Craswell N.
Craswell N.
Davenport T. H.
Glorot X.
Hinton G. E.
Kiros R.
Maybury M. T.
Mikolov T.
Mikolov T.
Mnih A.
Mnih A.
Moreira C.
Rumelhart D.
Shaw J. A.
Sorg P.
Vapnik V.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016
Field of study

We introduce an unsupervised discriminative model for the task of retrieving experts in online document collections. We exclusively employ textual evidence and avoid explicit feature engineering by learning distributed word representations in an unsupervised way. We compare our model to state-of-the-art unsupervised statistical vector space and probabilistic generative approaches. Our proposed log-linear model achieves the retrieval performance levels of state-of-the-art document-centric methods with the low inference cost of so-called profile-centric approaches. It yields a statistically significant improved ranking over vector space and generative models in most cases, matching the performance of supervised methods on various benchmarks. That is, by using solely text we can do as well as methods that work with external evidence and/or relevance feedback. A contrastive analysis of rankings produced by discriminative and generative approaches shows that they have complementary strengths due to the ability of the unsupervised discriminative model to perform semantic matching.Comment: WWW2016, Proceedings of the 25th International Conference on World Wide Web. 201

arXiv.org e-Print Archive

Crossref

UvA-DARE

International Migration, Integration and Social Cohesion online publications